Separable representations for cocktail party processing

Author

  • Alain de Cheveigné
Abstract

Perceptual parsing of a complex acoustic scene requires the following ingredients: (a) a representation that is “separable” in the sense that it allows patterns to be split into parts that belong to different sources, (b) rules and cues to guide the partition, and (c) processes that can make sense of the partitioned information when the partition is imperfect and the partitioned information is incomplete. Acoustic waveforms can usefully be represented in the time domain, in the frequency domain, in the cepstral domain, etc. These domains are transforms of one another, and for tasks such as recognizing speech in quiet they may in principle be used interchangeably. In the presence of noise or competing sources, however, partitioning may be easier within one representation than within others. This paper explores the idea that the auditory system forms a set of diverse representations that are redundant in quiet, but useful in cluttered acoustic scenes. These representations are produced by a combination of spectral analysis in the cochlea and time-domain neural processing within the nervous system. Partitioning is based on cues and rules such as those described by Bregman (1990). The final pattern-matching stage must be able to use the possibly incomplete information that survives the partitioning stage. Similar ideas may be applied to machine analysis of complex acoustic scenes.
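
The remark that the time, frequency, and cepstral domains are transforms of one another can be illustrated with a minimal sketch. It is not taken from the paper: the test frame, the function names, and the use of a naive DFT and of the real cepstrum are assumptions made for illustration. Note that the real cepstrum discards phase, so it maps back to the log-magnitude spectrum only, not to the waveform.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2)): time -> frequency."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT: frequency -> time."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def real_cepstrum(x):
    """Inverse DFT of the log-magnitude spectrum (phase is discarded)."""
    log_mag = [math.log(abs(v)) for v in dft(x)]
    return [c.real for c in idft(log_mag)]

# Hypothetical test frame: two sinusoids plus an impulse, so that no
# DFT bin has zero magnitude and the logarithm is always defined.
n = 32
frame = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 7 * t / n)
         for t in range(n)]
frame[0] += 1.0

# Time <-> frequency: the round trip recovers the waveform.
spectrum = dft(frame)
recovered = [v.real for v in idft(spectrum)]
round_trip_error = max(abs(a - b) for a, b in zip(frame, recovered))

# Cepstrum <-> log-magnitude spectrum: also an invertible pair.
cepstrum = real_cepstrum(frame)
log_mag = [math.log(abs(v)) for v in spectrum]
log_mag_again = [v.real for v in dft(cepstrum)]
cepstral_error = max(abs(a - b) for a, b in zip(log_mag, log_mag_again))

print(round_trip_error, cepstral_error)
```

Both errors should be at the level of floating-point rounding, which is the sense in which the representations are interchangeable in quiet. The sketch says nothing about which domain makes partitioning easier when sources compete.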

Similar resources

A Biologically Motivated Solution to the Cocktail Party Problem

We present a new approach to the cocktail party problem that uses a cortronic artificial neural network architecture (Hecht-Nielsen, 1998) as the front end of a speech processing system. Our approach is novel in three important respects. First, our method assumes and exploits detailed knowledge of the signals we wish to attend to in the cocktail party environment. Second, our goal is to provide...

Deep Transform: Cocktail Party Source Separation via Probabilistic Re-Synthesis

In cocktail party listening scenarios, the human brain is able to separate competing speech signals. However, the signal processing implemented by the brain to perform cocktail party listening is not well understood. Here, we trained two separate convolutive autoencoder deep neural networks (DNN) to separate monaural and binaural mixtures of two concurrent speech streams. We then used these DNN...

Improved Cocktail-Party Processing

The human auditory system is able to focus on one speech signal and ignore other speech signals in an auditory scene where several conversations are taking place. This ability of the human auditory system is referred to as the “cocktail-party effect”. This property of human hearing is partly made possible by binaural listening. Interaural time differences (ITDs) and interaural level differences...

L'amorçage sémantique masqué en situation de cocktail party (Masked semantic priming in cocktail party situation) [in French]

The present study aimed at testing automatic semantic processing in the auditory modality using the cocktail party situation. Participants had to perform a lexical decision task on a target item embedded in a multi-talker babble. This babbl...

Cocktail Party Processing via Structured Prediction

While human listeners excel at selectively attending to a conversation in a cocktail party, machine performance is still far inferior by comparison. We show that the cocktail party problem, or the speech separation problem, can be effectively approached via structured prediction. To account for temporal dynamics in speech, we employ conditional random fields (CRFs) to classify speech dominance ...

Publication date: 2005